Goto

Collaborating Authors

 soft cap


Methods of improving LLM training stability

Rybakov, Oleg, Chrzanowski, Mike, Dykas, Peter, Xue, Jinze, Lanir, Ben

arXiv.org Artificial Intelligence

Training stability of large language models(LLMs) is an important research topic. Reproducing training instabilities can be costly, so we use a small language model with 830M parameters and experiment with higher learning rates to force models to diverge. One of the sources of training instability is the growth of logits in attention layers. We extend the focus of the previous work and look not only at the magnitude of the logits but at all outputs of linear layers in the Transformer block. We observe that with a high learning rate the L2 norm of all linear layer outputs can grow with each training step and the model diverges. Specifically we observe that QKV, Proj and FC2 layers have the largest growth of the output magnitude. This prompts us to explore several options: 1) apply layer normalization not only after QK layers but also after Proj and FC2 layers too; 2) apply layer normalization after the QKV layer (and remove pre normalization). 3) apply QK layer normalization together with softmax capping. We show that with the last two methods we can increase learning rate by 1.5x (without model divergence) in comparison to an approach based on QK layer normalization only. Also we observe significant perplexity improvements for all three methods in comparison to the baseline model.


Deformable Tip Mount for Soft Growing Eversion Robots

Suulker, Cem, Skach, Sophie, Kaleel, Danyaal, Abrar, Taqi, Murtaza, Zain, Suulker, Dilara, Althoefer, Kaspar

arXiv.org Artificial Intelligence

Here we present a flexible tip mount for eversion (vine) robots. This soft cap allows attaching a payload to an eversion robot while allowing moving through narrow openings, as well as the eversion of protruding objects, and expanded surfaces.

  Country: Europe > United Kingdom > England > Greater London > London (0.05)
  Genre: Research Report (0.51)

Soft Cap for Eversion Robots

Suulker, Cem, Skach, Sophie, Kaleel, Danyaal, Abrar, Taqi, Murtaza, Zain, Suulker, Dilara, Althoefer, Kaspar

arXiv.org Artificial Intelligence

Growing robots based on the eversion principle are known for their ability to extend rapidly, from within, along their longitudinal axis, and, in doing so, reach deep into hitherto inaccessible, remote spaces. Despite many advantages, eversion robots also present significant challenges, one of which is maintaining sensory payload at the tip without restricting the eversion process. A variety of tip mechanisms has been proposed by the robotics community, among them rounded caps of relatively complex construction that are not always compatible with functional hardware, such as sensors or navigation pouches, integrated with the main eversion structure. Moreover, many tip designs incorporate rigid materials, reducing the robot's flexibility and consequent ability to navigate through narrow openings. Here, we address these shortcomings and propose a design to overcome them: a soft, entirely fabric based, cylindrical cap that can easily be slipped onto the tip of eversion robots. Having created a series of caps of different sizes and materials, an experimental study was conducted to evaluate our new design in terms of four key aspects: eversion robot made from multiple layers of everting material, solid objects protruding from the eversion robot, squeezability, and navigability. In all scenarios, we can show that our soft, flexible cap is robust in its ability to maintain its position and is capable of transporting payloads such as a camera across long distances.